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Abstract 

A new lattice protein model with a four-helix bundle ground state is ana- 
lyzed by a parameter-space Monte Carlo histogram technique to evaluate the 
effects of an extensive variety of model potentials on folding thermodynamics. 
Cooperative helical formation and contact energies based on a 5-letter alpha- 
bet are found to be insufficient to satisfy calorimetric and other experimental 
criteria for two-state folding. Such proteinlike behaviors are predicted, how- 
ever, by models with polypeptide-like local conformational restrictions and 
environment-dependent hydrogen bonding-like interactions. 

PACS Numbers : 87.15.Aa, 87.15. Cc, 87.15.He, 87.15.By 
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Proteins are complex systems. Insight into their behaviors has been gained by simplified 
models of generic proteins [1-6] . To serve as stepping stones towards an elucidation of the 
physics of real proteins, however, these models must be subjected to rigorous evaluations 
against experiments. Recently, we found that a number of popular lattice protein mod- 
els with 2, 3, and 20 residue types and pairwise additive contact energies do not satisfy 
the experimental criteria for two-state thermodynamic cooperativity which, among other 
conditions, requires a protein's van't Hoff to calorimetric enthalpy ratio AH v n/ AH ca \ m 1 
[7,8]. While certain Go models are cooperative, they do not address the physical nature of 
protein interactions since their potentials are teleological [8] in the sense that the energetic 
favorability of the native conformation as a whole is presupposed. 

It has since been proposed that a cooperative interplay between local conformational 
preferences and nonlocal interactions could give rise to proteinlike thermodynamics [7]. To 
evaluate the viability of this scenario, we introduce in this Letter a 55-mer chain model with a 
four- helix bundle ground state (Fig. 1) that shares common features with the corresponding 
protein motif [9]. The cubic-lattice helices are right-handed (called type (i) in Ref. [2]), as 
are most a-helices. The model has the following energetic components. (In the analysis 
below all energies are dimensionless and temperature independent.) 

Contact energies. A reduced 5-letter alphabet (Fig. 1) identical to the one recently opti- 
mized [5] is used for nearest-neighbor interactions, with energy parameters from Table III of 
Kolinski et al. [10]. While these energies are proteinlike to some extent because hydropho- 
bic groups are placed in the native core (Fig. 1), they do not represent the full interactions 
between amino acid residues [10]. Here they are adopted to capture heterogeneous aspects 
of intraprotein interactions [11]. 

Unfavorable local conformations. Two types of local non-proteinlike bond geometries are 
discouraged: the initiation of a left-handed (lh) helix (Fig. 2a), and one end of a helix taking 
a sharp turn (st) to fold back onto itself (Fig. 2b) are assigned unfavorable (> 0) energies 
to take into account that in real proteins left-handed a-helices are sterically disfavored, and 
that polypeptides are stiffer than a fully flexible chain [12]. 
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Environment- dependent hydrogen bonding. The favorable many-body interactions (£nb < 
0) in Fig. 2c, d are introduced to explore an idea, suggested by experiments [13], that the 
collective strength of hydrogen bonds is stronger when they are buried in the core of a 
protein than when they are exposed to water. Hence we focus mainly on b > 1 cases below. 
Analogous interactions have been used before [7,14]. 

Cooperative helical propagation. An extra favorable energy is assigned to every two 
consecutive helical turns (Fig. 2e) to encourage helix elongation. Such an effect may arise 
from dipole-dipole interactions between amide groups in real a-helices [12]. 

The total energy of a conformation from these contributions is 

E = Contact + 7lhiVl h + 7 stiV s t + ^Hb^Hb + ^Hb^iS + <?Helix^ H clix , (1) 

where -E con tact is the sum of 5-letter contact energies, iV lh and N st are respectively the numbers 
of all incidences of Fig. 2a,b. In the present analysis, A^eiix only counts those helices (Fig. 2e) 
that are parts of the four helices in the ground-state conformations (monomers 1-12, 15-26, 
30-41, and 44-55; see Fig. 1); and the hydrogen bonding pairs counted by (Fig. 2c) 
and TVjjb (Fig. 2d) are the 36 (i,i + 3) or + 5) contacting monomer pairs in the native 
helices. A first-principle treatment would have assigned hydrogen bonds and helical segments 
in a manner that do not require knowledge of the native structure. However, progress can 
nonetheless be made by the approach taken here, which seeks, as a first step in the inquiry, to 
ascertain the consequence of presupposing local native preferences, a presupposition that is 
notably less dependent on a priori knowledge than the assumption of global native preference 
in Go potentials. 

To efficiently determine how thermodynamic properties vary with the model energetic 
parameters, we use a generalization [3,15] of the standard Metropolis Monte Carlo his- 
togram technique [8,16] to eliminate the need to perform separate direct simulations for 
every parameter set of interest. Typical simulations are carried out with £ H b = ^Hciix = 
at a certain temperature T', during which numbers P of sampled conformations are binned 
into a multiple-dimensional array (histogram) according to (E 1 , , N^, A^ He iix), where 
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E' is the energy of the conformation. Thus the density of states of the simulated system 
g(E>, Ngl N§lN Kelix ) = P(E', N®,N$, N Kelix )e E '/ k * T ' , where k B is Boltzmann's constant. 
It follows that the partition function for any Snh, £ue\ix at any temperature T is given by 
Eg(E',N$,N®,N 1Mix )e- E / k » T , where the summation is over E', N®, N®, and iV Helix , 
and E = E' + £- Hb (N^ + bN^) + ^hcHx^VhcHx- A similar procedure is used to study the 
effects of 7m and 7 st . 

Each Monte Carlo run consists of 1.53 x 10 9 attempted moves, the first 3 x 10 7 of which 
are excluded from data acquisition. To generate a multiple-dimensional histogram, a total of 
20 runs at 10 different simulation T's (around the transition region) are performed, with two 
different random initial conformations for each T'. Values of k 2 estimated from different T's 
agree well, with standard deviation 4%. We have conducted extensive comparisons with 
direct simulations to validate the method [17]. No energy lower than that of the structure 
in Fig. 1 has been encountered. 

As in experimental calorimetry [18], we characterize the thermodynamic cooperativ- 
ity of a model protein by its specific heat capacity Cp and AH v n/AH ca ,i. When base- 
line subtraction is not applied to the Cp function, AH v yi/ AH ca \ may be equated to k 2 = 
2T max ^/k B C P (T max )/AH cah where A# cal = / °° dTC P {T) is the calorimetric enthalpy, and 
Cp(T) is maximum at T = T max [8]. Baseline subtractions amount to defining a multi- 
conformation native state and ignoring a part of the enthalpic variation in the denatured 
ensemble. This effectively reduces both the calorimetric enthalpy and the maximum heat 
capacity value, resulting in a modified enthalpy ratio (> k 2 ) [8]. 

Our main findings are summarized in Figs. 3 and 4. The apparent AH v n/ AH ca ,\ = 
after empirical baseline subtractions can often be close or equal to unity even when n 2 is low 
(Fig. 4). However, as is the case for a 3-letter model, a large discrepancy between k 2 and 
k 2 is often symptomatic of non-proteinlike significant post-denaturational chain expansion 
at T> T max [8]. Therefore, for model evaluation, proteinlike thermodynamic cooperativity 
requires both a small k$ — k 2 and ~ 1- 

Fig. 3 compares three models by this criterion. The least cooperative is a flexible chain 
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model with only pairwise additive contact energies. A second model with polypeptide-like 
local sterics has slightly enhanced cooperativity because populations of nonnative confor- 
mations with non-proteinlike local geometries that are hitherto favorable in a fully flexible 
chain model are reduced. However, both of these models are not proteinlike because of their 
significant post-denaturational chain expansions, as is evident from their thick denatured 
Cp "tails" at T ^ T max (Fig. 3), which account for these models' relatively large differences 
between and k 2 [8]. On the other hand, the model that also incorporates environment- 
dependent hydrogen bonding has more proteinlike thermodynamics: Its native Cp tail at 
T <C T max is thin, with Cp values lower on average than that of its denatured tail (Fig. 3). 
This implies that its native conformational diversity is limited, thus conforming better to 
NMR data [19] than a previously considered 20- letter model [8]. As for real proteins [20], its 
average radius of gyration undergoes a sharp change around T max , but has no appreciable 
post-denaturational increase (data not shown). 

Fig. 4 surveys a range of energetic parameters. Remarkably, local helical cooperativity 
has only a small impact on overall folding cooperativity, and proteinlike thermodynamics is 
possible at £hcUx = 0. While £ Helix < stabilizes the native state, it also stabilizes dena- 
tured conformations with partially intact native helices. Therefore, its effect on calorimetric 
cooperativity is not substantial because it cannot widen the average enthalpy difference be- 
tween native and denatured states significantly [7,8]. In contrast, calorimetric cooperativity 
increases sharply with more negative £nh- For 0.5 < b < 1.5, this effect is not very sensitive 
to b. For example, £nb = —0.5, £h c Hx = 0, b = 0.5 (and 7m = 6.0, 7 st = 5.0) lead to k 2 , 
k 2 ^ = 0.85, 0.99, which are only slightly lower than the re 2 , i$ values of 0.90, 1.0 for the 
same £nb and ^Heiix in Fig. 4 for b = 1.5. 

In addition, we found that a lattice version of helix capping [21] has a slight attenuating 
effect on cooperativity [17]. Consistent with experiments [22], in our model, tertiary inter- 
actions are essential in stablizing helices in native structures. When folded independently, 
the two 12-mer sequences for the native helices at 1-12 and 15-26 are much less stable 
(Tmax ~ 0.3), and their thermal transitions are not calorimetrically cooperative. 
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Figs. 3 and 4 suggest that a cooperative interplay between local conformational pref- 
erences and nonlocal contact interactions, as exemplified by the environment-dependent 
hydrogen bonding in the model, is a viable mechanism for proteinlike thermodynamics; and 
that the required cooperative effect (&£ H b = —0.8) need not be exceedingly strong relative 
to the pairwise contact energies (average magnitude = 0.66). This observation is consis- 
tent with a previous high-coordination lattice model study [23], though the latter did not 
address the calorimetric criterion. The present approach did not consider non-native hy- 
drogen bonding. Cooperativity in the present model would be reduced if such non-native 
conformations are favored. Further investigations using continuum models with polypeptide 
chain geometry are necessary to ascertain whether real proteins can have substantial num- 
ber of non-native hydrogen bonds. While the present model should be regarded as tentative 
because it relies on local native information, its proteinlike features do not follow trivially 
from part of its interactions' native-centric nature per se. Important physical principles have 
emerged from our analysis because not all native-centric interaction schemes can bring about 
comparable enhancements in thermodynamic cooperativity: (i) The fact that £hcU x < is 
neither necessary nor sufficient for proteinlike thermodynamics suggests that multi-body 
interactions that favors local native conformation irrespective of tertiary packing cannot 
account for calorimetric cooperativity. (ii) A Go model for the ground-state conformations 
in Fig. 1, which exploits both local and nonlocal native information but is based exclusively 
on pairwise additive interactions, has k 2 = 0.73 and is thus less cooperative than the model 
with k,2 = 0.91 in Fig. 3. (in) Proteinlike steric effects contribute to cooperativity. If non- 
proteinlike local conformations were not disfavored in the latter model (i.e., if 7ih = 7st = 0), 
k 2 would be reduced to 0.85. 

We have thus mapped out a general investigative strategy and established the viability 
of a folding scenario. It should be emphasized, however, that satisfying the requirements 
for thermodynamic cooperativity is clearly necessary but not sufficient for the validity of 
a scenario's underlying physical mechanisms. Whether hydrogen bonding is favorable to 
native stability remains controversial [13,24]. The present choice of £nb < is motivated 
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by experiments [13]. But there have been theoretical suggestions that hydrogen bonding 
disfavors the folded state of a protein [24]. In the present modeling framework, that would 
be detrimental to calorimetric cooperativity as it corresponds to £nb > 0, leading to k 2 s even 
smaller than the £ H b = case (Fig. 3). For instance, when £ H b = +0-1 and fheiix = 0, (and 
7lh = 6.0, 7 s t = 5.0), « 2 , 4 S) = 0.51, 0.80 for b = 0.5 and « 2 , 4° = 0.45, 0.70 for b = 1.5, 
respectively. If that turns out to be the case, there would be added impetus to extend the 
present method to ascertain the role of other mechanisms such as sidechain packing [4,8,25] 
in protein calorimetric two-state cooperativity. 

This work was supported by Medical Research Council of Canada grant MT-15323. 
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Figure Captions 



Fig. 1. Ground state conformations of the model. Black beads denote nominally hy- 
drophobic monomers. The numbers label selected sequence positions. Each of the 3 short 
loops at positions (13, 14), (27, 28, 29), and (42, 43) has two iso-energetic local conforma- 
tions. Thus the ground state has 8 conformations, one of which is shown. 

Fig. 2. Energetic components of the model. (a,b) Each contact marked by a double ar- 
row is assigned an energy (as shown). The dotted lines in (b) depict an alternate path of the 
chain from monomer i — 3 to i that has one instead of two unfavorable contacts. Similarly, 
an energy is associated with each buried hydrogen bond (c,d) and each occurrence of two 
consecutive turns (layers) of a right-handed lattice helix (e). In (c,d), hydrogen bonds are 
represented as ladders linking pairs of encircled monomers. Their burial requires occupation 
of at least 6 of their neighbor sites. The energy of a completely buried bond with 8 occupied 
neighbor sites (d) can be stronger (b > 1) or weaker (b < 1) than a partially buried bond 
with 6 or 7 occupied neighbor sites. The latter two cases have the same energy £nb and are 
accounted for collectively by in Eq. (1). 

Fig. 3. Specific heat capacity functions. In all three cases shown, £hcUx = 0. T max s are 
marked by vertical lines. /,From left to right, the first model has only the 5- letter pairwise 
contact energies (7m = 7st = ^Hb = 0). In addition to these, the second model incorporates 
the repulsive interactions in Fig. 2a,b, with 7^ = 6.0 and 7 st = 5.0. The third model further 
adds the favorable hydrogen-bonding energies in Fig. 2c, d, with £ H b = —0.53, 6£ H b = —0.8, 
and b = 1.5. k 2 = 0.55, 0.62, and 0.91, and after the plotted inclined baselines are sub- 
tracted, k\ = 0.93, 0.93, and 1.01, respectively. 

Fig. 4. Calorimetric cooperativity as a function of local helical cooperativity and 
environment-dependent hydrogen bonding strength, with b = 1.5, 7^ = 6.0 and 7 st = 5.0. 
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k~2 an d ^2 are given by the upper and lower surfaces, respectively. The black dots con- 
nected by a vertical bar mark the parameters for the most cooperative case in Fig. 3. 
are calculated using empirical baselines [8] constructed as tangents of the Cp function at 
C P (T)/C P (T max ) = -0.001, where C' P = d 2 C P /dT 2 (see Fig. 3). 
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